KP-Miner: A keyphrase extraction system for English and Arabic documents

نویسندگان

  • Samhaa R. El-Beltagy
  • Ahmed A. Rafea
چکیده

Automatic key phrase extraction has many important applications including but not limited to summarization, cataloging/indexing, feature extraction for clustering and classification, and data mining. This paper presents the KP-Miner system, and demonstrates through experimentation and comparison with existing systems that the it is effective in extracting keyphrases from both English and Arabic documents of varied length. Unlike other existing keyphrase extraction systems, the KP-Miner system has the advantage of being configurable as the rules and heuristics adopted by the system are related to the general nature of documents and keyphrase. This implies that the users of this system can use their understanding of the document(s) being input into the system, to fine tune it to their particular needs.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

KP-Miner: Participation in SemEval-2

This paper briefly describes the KP-Miner system which is a system developed for the extraction of keyphrases from English and Arabic documents, irrespective of their nature. The paper also outlines the performance of the system in the “Automatic Keyphrase Extraction from Scientific Articles” task which is part of SemEval-2.

متن کامل

Likey: Unsupervised Language-Independent Keyphrase Extraction

Likey is an unsupervised statistical approach for keyphrase extraction. The method is language-independent and the only language-dependent component is the reference corpus with which the documents to be analyzed are compared. In this study, we have also used another language-dependent component: an English-specific Porter stemmer as a preprocessing step. In our experiments of keyphrase extract...

متن کامل

روش جدید متن‌کاوی برای استخراج اطلاعات زمینه کاربر به‌منظور بهبود رتبه‌بندی نتایج موتور جستجو

Today, the importance of text processing and its usages is well known among researchers and students. The amount of textual, documental materials increase day by day. So we need useful ways to save them and retrieve information from these materials. For example, search engines such as Google, Yahoo, Bing and etc. need to read so many web documents and retrieve the most similar ones to the user ...

متن کامل

CollabRank: Towards a Collaborative Approach to Single-Document Keyphrase Extraction

Previous methods usually conduct the keyphrase extraction task for single documents separately without interactions for each document, under the assumption that the documents are considered independent of each other. This paper proposes a novel approach named CollabRank to collaborative single-document keyphrase extraction by making use of mutual influences of multiple documents within a cluste...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Inf. Syst.

دوره 34  شماره 

صفحات  -

تاریخ انتشار 2009